opt: five IR and LLVM optimisations for bfc #5
Merged
Adds optimise_program() called from both bfc and bfi after parsing. The first pass (cancel_opposing) merges adjacent INC/DEC and RIGHT/LEFT pairs, subtracting their counts and removing pairs that fully cancel. Bracket jump indices are remapped after compaction.
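A minimal sketch of how such a cancellation pass could look. The `Cmd` layout, the enum names and `cancel_opposing_sketch` are hypothetical stand-ins, since the PR's actual IR definitions are not shown here. This version also merges same-kind neighbours that become adjacent after a cancellation, and it recomputes bracket targets after compaction rather than remapping old indices; both are choices of the sketch, not necessarily what the PR does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IR node; the PR's real definitions are not shown. */
typedef enum {
  CMD_INC, CMD_DEC, CMD_RIGHT, CMD_LEFT,
  CMD_LOOP_START, CMD_LOOP_END, CMD_OUT, CMD_IN
} CmdKind;

typedef struct {
  CmdKind kind;
  size_t count; /* repeat count for INC/DEC/RIGHT/LEFT, assumed >= 1 */
  size_t jump;  /* index of the matching bracket for loop commands */
} Cmd;

static int is_counted(CmdKind k) {
  return k == CMD_INC || k == CMD_DEC || k == CMD_RIGHT || k == CMD_LEFT;
}

static int is_opposite(CmdKind a, CmdKind b) {
  return (a == CMD_INC && b == CMD_DEC) || (a == CMD_DEC && b == CMD_INC) ||
         (a == CMD_RIGHT && b == CMD_LEFT) || (a == CMD_LEFT && b == CMD_RIGHT);
}

/* Compacts cmds[0..n) in place and returns the new length. */
size_t cancel_opposing_sketch(Cmd *cmds, size_t n) {
  assert(cmds != NULL || n == 0);
  size_t out = 0;
  for (size_t i = 0; i < n; i++) {
    Cmd c = cmds[i];
    Cmd *top = out > 0 ? &cmds[out - 1] : NULL;
    if (top != NULL && is_counted(c.kind) && top->kind == c.kind) {
      top->count += c.count; /* same op: merge run lengths */
    } else if (top != NULL && is_opposite(top->kind, c.kind)) {
      if (top->count > c.count) {
        top->count -= c.count; /* earlier op wins */
      } else if (top->count < c.count) {
        top->kind = c.kind;    /* later op wins */
        top->count = c.count - top->count;
      } else {
        out--;                 /* the pair cancels completely */
      }
    } else {
      cmds[out++] = c;
    }
  }
  /* Recompute bracket targets; the stack of open brackets is threaded
     through the jump field, so no allocation is needed. Assumes the
     parser has already rejected unbalanced brackets. */
  size_t open = SIZE_MAX;
  for (size_t i = 0; i < out; i++) {
    if (cmds[i].kind == CMD_LOOP_START) {
      cmds[i].jump = open;
      open = i;
    } else if (cmds[i].kind == CMD_LOOP_END) {
      size_t start = open;
      assert(start != SIZE_MAX);
      open = cmds[start].jump;
      cmds[start].jump = i;
      cmds[i].jump = start;
    }
  }
  return out;
}
```

Under these assumptions, `+[><-].` (seven nodes) compacts to five nodes, with the loop brackets pointing at indices 3 and 1.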
Adds detect_clear_loops() pass: a loop whose body is a single INC or DEC (any count) is replaced with the synthetic CMD_CLEAR node, emitting a single store i8 0 in LLVM IR and a direct zero-assignment in the interpreter. Updates test_simple_loop.filecheck to match.
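The pattern match described above can be sketched as follows. The node layout and `detect_clear_loops_sketch` are hypothetical; only the `CMD_CLEAR` name comes from the PR. One caveat worth keeping in mind: on wrapping 8-bit cells, a body with an even count (e.g. `[--]`) only terminates for some starting values, so replacing it with a clear is exact only for odd counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IR node; only the CMD_CLEAR name is taken from the PR. */
typedef enum {
  CMD_INC, CMD_DEC, CMD_RIGHT, CMD_LEFT,
  CMD_LOOP_START, CMD_LOOP_END, CMD_OUT, CMD_IN, CMD_CLEAR
} CmdKind;

typedef struct {
  CmdKind kind;
  size_t count; /* repeat count for INC/DEC/RIGHT/LEFT */
  size_t jump;  /* index of the matching bracket for loop commands */
} Cmd;

/* Rewrites every "[", INC-or-DEC, "]" triple to one CMD_CLEAR node.
   Works in place and returns the new length. */
size_t detect_clear_loops_sketch(Cmd *cmds, size_t n) {
  assert(cmds != NULL || n == 0);
  size_t out = 0;
  for (size_t i = 0; i < n; i++) {
    if (i + 2 < n && cmds[i].kind == CMD_LOOP_START &&
        (cmds[i + 1].kind == CMD_INC || cmds[i + 1].kind == CMD_DEC) &&
        cmds[i + 2].kind == CMD_LOOP_END) {
      Cmd clear = {CMD_CLEAR, 1, 0};
      cmds[out++] = clear; /* an interpreter would run this as cell = 0 */
      i += 2;              /* skip the body and the closing bracket */
      continue;
    }
    cmds[out++] = cmds[i];
  }
  /* Indices shifted, so recompute bracket targets. The stack of open
     brackets is threaded through the jump field. Assumes balanced input. */
  size_t open = SIZE_MAX;
  for (size_t i = 0; i < out; i++) {
    if (cmds[i].kind == CMD_LOOP_START) {
      cmds[i].jump = open;
      open = i;
    } else if (cmds[i].kind == CMD_LOOP_END) {
      size_t start = open;
      assert(start != SIZE_MAX);
      open = cmds[start].jump;
      cmds[start].jump = i;
      cmds[i].jump = start;
    }
  }
  return out;
}
```

With this sketch, the nested program `[[-]]` becomes `[`, `CMD_CLEAR`, `]`, with the outer brackets re-linked to each other.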
Adds detect_multiply_loops() pass: a loop whose body contains only
+/-/</> with net pointer delta 0 and loop-counter delta -1 is
replaced with CMD_MULTIPLY. Each non-counter cell touched becomes an
{offset, factor} pair. Codegen emits counter load, multiply-adds, then
store i8 0. Supports up to MULTIPLY_MOVES_MAX (8) target cells, offsets
within ±64. Adds test/res/multiply.b and test/test_multiply.filecheck.
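The analysis and the resulting multiply-add can be sketched roughly as below. `analyse_multiply_body`, `apply_multiply`, `Move` and the `Cmd` layout are hypothetical names for illustration; only `MULTIPLY_MOVES_MAX` and the ±64 offset limit come from the PR. The sketch stores factors modulo 256 to match 8-bit cells, so a factor of -1 appears as 255; that representation is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MULTIPLY_MOVES_MAX 8   /* limit named in the PR */
#define MULTIPLY_OFFSET_MAX 64 /* offsets within +/-64, per the PR */

/* Hypothetical node and result types for this sketch. */
typedef enum { CMD_INC, CMD_DEC, CMD_RIGHT, CMD_LEFT, CMD_OTHER } CmdKind;
typedef struct { CmdKind kind; size_t count; } Cmd;
typedef struct { int offset; uint8_t factor; } Move;

/* Analyses a loop body (the commands between the brackets). Returns the
   number of {offset, factor} moves written to moves[], or -1 if the body
   is not a multiply loop. moves[] needs MULTIPLY_MOVES_MAX slots. */
int analyse_multiply_body(const Cmd *body, size_t n, Move *moves) {
  assert(body != NULL && moves != NULL);
  uint8_t delta[2 * MULTIPLY_OFFSET_MAX + 1] = {0}; /* net change per cell */
  int pos = 0;                                      /* pointer offset */
  for (size_t i = 0; i < n; i++) {
    size_t count = body[i].count;
    size_t k = (size_t)(pos + MULTIPLY_OFFSET_MAX);
    switch (body[i].kind) {
      case CMD_INC: delta[k] = (uint8_t)(delta[k] + count); break;
      case CMD_DEC: delta[k] = (uint8_t)(delta[k] - count); break;
      case CMD_RIGHT:
        if (count > MULTIPLY_OFFSET_MAX) return -1;
        pos += (int)count;
        if (pos > MULTIPLY_OFFSET_MAX) return -1;
        break;
      case CMD_LEFT:
        if (count > MULTIPLY_OFFSET_MAX) return -1;
        pos -= (int)count;
        if (pos < -MULTIPLY_OFFSET_MAX) return -1;
        break;
      default: return -1; /* I/O or nested loops disqualify the body */
    }
  }
  /* Pointer must return home and the counter must drop by one (255 is
     -1 modulo 256) per iteration. */
  if (pos != 0 || delta[MULTIPLY_OFFSET_MAX] != 0xFF) return -1;
  int n_moves = 0;
  for (int off = -MULTIPLY_OFFSET_MAX; off <= MULTIPLY_OFFSET_MAX; off++) {
    uint8_t f = delta[off + MULTIPLY_OFFSET_MAX];
    if (off == 0 || f == 0) continue;
    if (n_moves == MULTIPLY_MOVES_MAX) return -1;
    moves[n_moves].offset = off;
    moves[n_moves].factor = f;
    n_moves++;
  }
  return n_moves;
}

/* Interpreter-style equivalent of the codegen order in the commit message:
   load the counter, do the multiply-adds, then store 0. The caller must
   ensure every dp + offset lies inside the tape. */
void apply_multiply(uint8_t *tape, size_t dp, const Move *moves, int n_moves) {
  uint8_t counter = tape[dp];
  for (int i = 0; i < n_moves; i++) {
    size_t idx = (size_t)((ptrdiff_t)dp + moves[i].offset);
    tape[idx] = (uint8_t)(tape[idx] + counter * moves[i].factor);
  }
  tape[dp] = 0;
}
```

For the body of `[->+<]` the analysis yields the single move `{1, 1}`; applying it to a tape starting `5, 2` leaves `0, 7`, which is what running the loop five times would produce.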
Removes the @dp global variable and replaces it with an alloca in the main function entry block. The LLVMValueRef ctx->dp is still a pointer so all load/store callsites are unchanged. With LLVM's mem2reg pass (applied under -O) the alloca is promoted to an SSA register, removing all dp memory traffic. Updates all FileCheck tests accordingly.
Wires the already-parsed --optimise flag from main_bfc.c through generate(program, optimise). When true, runs mem2reg,instcombine,simplifycfg,gvn via LLVMRunPasses (new pass manager, LLVM >= 14). mem2reg promotes the dp alloca to SSA registers; gvn eliminates redundant loads; instcombine and simplifycfg clean up the result. Adds the passes component to llvm_map_components_to_libnames. FileCheck tests are unaffected as they do not pass -O.
Replace (unsigned long long) casts with (uint64_t) to satisfy cpplint runtime/int rule. Re-run clang-format to fix indentation in multiply(), detect_multiply_loops(), and the CMD_MULTIPLY interp case.
The clang static analyzer in CI flags SIZE_MAX (ir.c) and uint8_t (interp.c) as undeclared without an explicit stdint.h include.
Summary
1. `+`/`-` and `>`/`<` pairs are merged/cancelled in the IR phase (e.g. `+++--` → `+1`).
2. `CMD_CLEAR`: the `[-]`/`[+]` zero-cell idiom is replaced with a single `store i8 0` instead of a full loop.
3. `CMD_MULTIPLY`: multiply-add loops (e.g. `[->+<]`) are detected and replaced with counter load + `mul`/`add` + zero, eliminating the loop entirely. Supports up to 8 target cells within ±64 offset.
4. `dp` as `alloca`: the data-pointer is moved from a global to an `alloca` in `main`, enabling LLVM's `mem2reg` pass to promote it to a register.
5. LLVM passes (`-O`/`--optimise`): wires the already-parsed flag through `generate()` and runs `mem2reg,instcombine,simplifycfg,gvn` via `LLVMRunPasses`. Under `-O`, `helloworld.b` produces completely loop-free IR with constant-folded GEPs and no `dp` loads.

IR-level passes (1–3) always run; LLVM passes (5) are gated on `-O`. All existing tests pass and a new `test/test_multiply.filecheck` is added.

Test plan
- `cmake --build build --target tests` passes (all FileCheck, unit, and expect tests)
- `build/bfc test/res/helloworld.b` produces valid IR that compiles and runs correctly
- `build/bfc -O test/res/helloworld.b` produces noticeably simpler IR (no `%dp` loads, no loop blocks)
- `build/bfi test/res/factorial.b` still produces correct output